DYNAMIC MIXTURES OF FACTOR ANALYZERS TO CHARACTERIZE MULTIVARIATE AIR POLLUTANT EXPOSURES / Maruotti, Antonello; Bulla, Jan; Lagona, Francesco; Picone, Marco; Martella, Francesca. - In: THE ANNALS OF APPLIED STATISTICS. - ISSN 1932-6157. - 11:3(2017), pp. 1617-1648.

DYNAMIC MIXTURES OF FACTOR ANALYZERS TO CHARACTERIZE MULTIVARIATE AIR POLLUTANT EXPOSURES

MARTELLA, Francesca
2017

Abstract

The assessment of pollution exposure is based on the analysis of multivariate time series that include the concentrations of several pollutants as well as the measurements of multiple atmospheric variables. It typically requires methods of dimensionality reduction that are capable of identifying potentially dangerous combinations of pollutants and, simultaneously, of segmenting exposure periods according to air quality conditions. When the data are high-dimensional, however, efficient dimensionality reduction is challenging because of the formidable structure of cross-correlations that arise from the dynamic interaction between weather conditions and natural/anthropogenic pollution sources. In order to assess pollution exposure in an urban area while taking the above-mentioned difficulties into account, we develop a class of parsimonious hidden Markov models. In a multivariate time-series setting, this approach allows temporal segmentation and dimensionality reduction to be performed simultaneously. We specifically approximate the distribution of multiple pollutant concentrations by mixtures of factor analysis models, whose parameters evolve according to a latent Markov chain. Covariates are included as predictors of the chain transition probabilities. Parameter constraints on the factorial component of the model are exploited to tune the flexibility of dimensionality reduction. In order to estimate the model parameters efficiently, we propose a novel three-step Alternating Expected Conditional Maximization (AECM) algorithm, which is also assessed in a simulation study. In the case study, the proposed methods were able (1) to describe the exposure to pollution in terms of a few latent regimes, (2) to associate these regimes with specific combinations of pollutant concentration levels as well as distinct correlation structures between concentrations, and (3) to capture the influence of weather conditions on transitions between regimes.
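
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the paper's parameterization or its AECM algorithm: it assumes the textbook factor-analysis covariance (loadings times their transpose plus a diagonal matrix of uniquenesses) for each latent regime, and a multinomial-logit link between covariates and transition probabilities. The function names, parameter values, and the single "wind speed" covariate are all hypothetical.

```python
import math


def factor_covariance(loadings, uniquenesses):
    """Covariance implied by a factor-analysis model for one regime.

    Assumed textbook form: Lambda * Lambda^T + diag(psi), where
    `loadings` is a p x q list of lists and `uniquenesses` has length p.
    """
    p = len(loadings)
    cov = [[sum(a * b for a, b in zip(loadings[i], loadings[j]))
            for j in range(p)] for i in range(p)]
    for i in range(p):
        cov[i][i] += uniquenesses[i]
    return cov


def transition_row(intercepts, slopes, covariates):
    """Transition probabilities out of one regime (assumed logit link).

    Each destination regime k gets a linear score
    intercepts[k] + slopes[k] . covariates; a softmax turns the scores
    into probabilities that sum to one.
    """
    scores = [b0 + sum(b * x for b, x in zip(bs, covariates))
              for b0, bs in zip(intercepts, slopes)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical regime: 3 pollutants driven by 1 latent factor.
loadings = [[0.9], [0.8], [0.1]]
uniquenesses = [0.2, 0.3, 0.5]
cov = factor_covariance(loadings, uniquenesses)

# Hypothetical 2-regime chain with one covariate (say, wind speed).
intercepts = [0.0, -1.0]
slopes = [[0.0], [0.5]]
calm = transition_row(intercepts, slopes, [0.0])
windy = transition_row(intercepts, slopes, [4.0])
```

In this toy setting the first two pollutants load heavily on the single factor, so their implied covariance is large, while the third is almost uncorrelated with them. Raising the covariate shifts probability mass toward the second regime, which is the kind of weather-driven regime switching the abstract describes.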
Hidden Markov models; AECM algorithm; dimensionality reduction; three-step algorithm
01 Journal publication::01a Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/952297